the 1996 Conference on Parallel Architectures and

Authors

  • Po-Yung Chang
  • Marius Evers
  • Yale N. Patt
Abstract

Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE.

Today's deeply pipelined, superscalar processors rely on accurate branch prediction in order to approach their performance potential. Branch mispredictions result in a flushing of the speculative information in the pipeline, thus limiting the amount of useful work that can be done. The 2-level branch predictors have been shown to achieve high prediction accuracy. However, it has also been shown that there is a high degree of pattern history table interference in 2-level branch predictors and that the interference generally has a negative effect on the prediction accuracy. This paper introduces a method for reducing the pattern history table interference by dynamically identifying some easily predictable branches and inhibiting the pattern history table update for these branches. We show how this technique reduces pattern history table interference for two versions of the 2-level branch predictor and that this significantly improves branch prediction accuracy for the SPEC 95 benchmarks. In particular, we eliminate up to 30% of the mispredictions for the gcc benchmark.
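The idea of inhibiting pattern history table (PHT) updates for easily predictable branches can be illustrated with a small toy model. The sketch below is not the authors' design: the gshare-style indexing, the run-length filter, and the names `FilteredGshare`, `history_bits`, and `filter_max` are all assumptions made for illustration, since the abstract does not say how easy branches are identified.

```python
class FilteredGshare:
    """Toy 2-level predictor with a PHT-update filter (illustrative assumptions only)."""

    def __init__(self, history_bits=8, filter_max=15):
        self.history_bits = history_bits      # assumed global history length
        self.filter_max = filter_max          # assumed run length marking a branch "easy"
        self.mask = (1 << history_bits) - 1
        self.history = 0                      # global branch history register
        self.pht = [2] * (1 << history_bits)  # 2-bit counters, start weakly taken
        self.runs = {}                        # pc -> (last direction, same-direction run)

    def _index(self, pc):
        # gshare-style index: branch address XOR global history (an assumption).
        return (pc ^ self.history) & self.mask

    def _filtered(self, pc):
        # A branch is treated as easily predictable once its run saturates.
        direction, run = self.runs.get(pc, (True, 0))
        return run >= self.filter_max, direction

    def predict(self, pc):
        filtered, direction = self._filtered(pc)
        if filtered:
            return direction                  # bypass the PHT for easy branches
        return self.pht[self._index(pc)] >= 2

    def update(self, pc, taken):
        filtered, _ = self._filtered(pc)
        if not filtered:
            # Only branches that are not filtered train the shared PHT counters.
            i = self._index(pc)
            if taken:
                self.pht[i] = min(3, self.pht[i] + 1)
            else:
                self.pht[i] = max(0, self.pht[i] - 1)
        # Track how long this branch has gone in the same direction.
        direction, run = self.runs.get(pc, (taken, 0))
        if direction == taken:
            self.runs[pc] = (taken, min(self.filter_max, run + 1))
        else:
            self.runs[pc] = (taken, 1)        # direction changed: restart the run
        # Assumption: the global history is still shifted for filtered branches.
        self.history = ((self.history << 1) | int(taken)) & self.mask
```

In this sketch, a branch that has gone the same way `filter_max` times in a row stops writing to the PHT, so it can no longer disturb counters shared with other branches; it resumes training the PHT as soon as its direction changes.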

Similar resources

Developing a new model for availability optimization applied to a series-parallel system (Quality Engineering Conference Paper)

Redundancy is a known technique for enhancing the reliability and availability of non-repairable systems, but for repairable systems another factor becomes prominent: the number of maintenance resources. In this study, availability optimization of series-parallel systems is modelled using a Markovian process by which the number of maintenance resources is located into the obje...

Optimal Fine and Medium Grain Parallelism Detection in Polyhedral Reduced Dependence Graphs - Parallel Architectures and Compilation Techniques, 1996., Proceedings of the 1996 Conference on

This paper proposes an optimal algorithm for detecting fine or medium grain parallelism in nested loops whose dependences are described by an approximation of distance vectors by polyhedra. In particular, it is optimal for direction vectors, which generalizes Wolf and Lam’s algorithm to the case of several statements. It relies on a dependence uniformization process and on parallelization tech...

Ultra-Low-Energy DSP Processor Design for Many-Core Parallel Applications

Background and Objectives: Digital signal processors are widely used in energy constrained applications in which battery lifetime is a critical concern. Accordingly, designing ultra-low-energy processors is a major concern. In this work and in the first step, we propose a sub-threshold DSP processor. Methods: As our baseline architecture, we use a modified version of an existing ultra-low-power...

Parallel Algorithms for Fractal Image Coding on MIMD Architectures

In this paper parallel algorithms for fractal image coding on MIMD architectures are introduced and discussed. It turns out that the crucial point for the choice of a suitable parallelization strategy is the memory capacity of a processor element. Experimental results show a linear speedup of the proposed algorithms.

The Effect of Self-Assessment and Conference on EFL Students’ Production of Speech Acts and Politeness Markers: Alternatives on the Horizon?

Alternative assessment approaches received considerable attention soon after a discontent with traditional, one-shot testing. These approaches, however, have been used only to improve learners’ linguistic ability despite communicative models of language which pointed that knowledge of language also involves pragmatic ability (Bachman, 1990; Bachman & Palmer, 1996). The present study tries to ex...

Algorithms and Architectures for Parallel Processing, 10th International Conference, ICA3PP 2010, Busan, Korea, May 21-23, 2010. Proceedings. Part I

When there are many people who don't need to expect something more than the benefits to take, we will suggest you to have willing to reach all benefits. Be sure and surely do to take this algorithms and architectures for parallel processing 10th international conference ica3pp 2010 busan korea may 21 23 2010 proceedings part i computer science and general issues that gives the best reasons to r...

Publication date: 1996